Overview

Dataset statistics

Number of variables: 24
Number of observations: 129487
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 23.7 MiB
Average record size in memory: 192.0 B
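
The two memory figures agree with each other under a simple assumption. The sketch below assumes that each of the 24 variables occupies 8 bytes per row (the report does not list storage types, so this is an assumption), which gives a 192-byte record and a total close to the reported 23.7 MiB.

```python
# Assumption (not stated in the report): every variable is stored
# as an 8-byte value, e.g. a 64-bit number or an object pointer.
BYTES_PER_VALUE = 8

n_variables = 24
n_observations = 129487

record_bytes = n_variables * BYTES_PER_VALUE
total_mib = record_bytes * n_observations / 2**20  # 1 MiB = 2**20 bytes

print(f"record size: {record_bytes} B")
print(f"total size: {total_mib:.1f} MiB")
```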

Variable types

Numeric: 18
Categorical: 6

Alerts

Seat comfort is highly correlated with Food and drink (High correlation)
Departure_Arrival time convenient is highly correlated with Food and drink and 1 other field (High correlation)
Food and drink is highly correlated with Seat comfort and 2 other fields (High correlation)
Gate location is highly correlated with Departure_Arrival time convenient and 1 other field (High correlation)
Inflight wifi service is highly correlated with Online support and 2 other fields (High correlation)
Online support is highly correlated with Inflight wifi service and 2 other fields (High correlation)
Ease of Online booking is highly correlated with Inflight wifi service and 2 other fields (High correlation)
Onboard service is highly correlated with Baggage handling and 1 other field (High correlation)
Baggage handling is highly correlated with Onboard service and 1 other field (High correlation)
Cleanliness is highly correlated with Onboard service and 1 other field (High correlation)
Online boarding is highly correlated with Inflight wifi service and 2 other fields (High correlation)
Departure Delay in Minutes is highly correlated with Arrival Delay in Minutes (High correlation)
Arrival Delay in Minutes is highly correlated with Departure Delay in Minutes (High correlation)
Seat comfort is highly correlated with Food and drink (High correlation)
Departure_Arrival time convenient is highly correlated with Gate location (High correlation)
Food and drink is highly correlated with Seat comfort (High correlation)
Gate location is highly correlated with Departure_Arrival time convenient (High correlation)
Inflight wifi service is highly correlated with Ease of Online booking and 1 other field (High correlation)
Online support is highly correlated with Ease of Online booking and 1 other field (High correlation)
Ease of Online booking is highly correlated with Inflight wifi service and 2 other fields (High correlation)
Onboard service is highly correlated with Cleanliness (High correlation)
Baggage handling is highly correlated with Cleanliness (High correlation)
Cleanliness is highly correlated with Onboard service and 1 other field (High correlation)
Online boarding is highly correlated with Inflight wifi service and 2 other fields (High correlation)
Departure Delay in Minutes is highly correlated with Arrival Delay in Minutes (High correlation)
Arrival Delay in Minutes is highly correlated with Departure Delay in Minutes (High correlation)
Class is highly correlated with Type of Travel (High correlation)
Type of Travel is highly correlated with Class (High correlation)
df_index is highly correlated with satisfaction and 10 other fields (High correlation)
satisfaction is highly correlated with df_index and 5 other fields (High correlation)
Customer Type is highly correlated with df_index (High correlation)
Type of Travel is highly correlated with df_index (High correlation)
Class is highly correlated with df_index (High correlation)
Seat comfort is highly correlated with df_index and 5 other fields (High correlation)
Departure_Arrival time convenient is highly correlated with Seat comfort and 2 other fields (High correlation)
Food and drink is highly correlated with Seat comfort and 3 other fields (High correlation)
Gate location is highly correlated with Seat comfort and 2 other fields (High correlation)
Inflight wifi service is highly correlated with Online support and 2 other fields (High correlation)
Inflight entertainment is highly correlated with df_index and 5 other fields (High correlation)
Online support is highly correlated with satisfaction and 5 other fields (High correlation)
Ease of Online booking is highly correlated with df_index and 8 other fields (High correlation)
Onboard service is highly correlated with df_index and 5 other fields (High correlation)
Leg room service is highly correlated with df_index and 3 other fields (High correlation)
Baggage handling is highly correlated with df_index and 3 other fields (High correlation)
Checkin service is highly correlated with Online support (High correlation)
Cleanliness is highly correlated with df_index and 4 other fields (High correlation)
Online boarding is highly correlated with Inflight wifi service and 3 other fields (High correlation)
Departure Delay in Minutes is highly correlated with Arrival Delay in Minutes (High correlation)
Arrival Delay in Minutes is highly correlated with Departure Delay in Minutes (High correlation)
df_index is uniformly distributed (Uniform)
df_index has unique values (Unique)
Seat comfort has 4781 (3.7%) zeros (Zeros)
Departure_Arrival time convenient has 6644 (5.1%) zeros (Zeros)
Food and drink has 5922 (4.6%) zeros (Zeros)
Inflight entertainment has 2968 (2.3%) zeros (Zeros)
Departure Delay in Minutes has 73209 (56.5%) zeros (Zeros)
Arrival Delay in Minutes has 72753 (56.2%) zeros (Zeros)
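
The percentages in the zero alerts are each zero count divided by the 129487 observations. A minimal sketch of that arithmetic, using only the counts listed above (the variable names in the code are illustrative, not part of the report):

```python
n_observations = 129487

# Zero counts as listed in the alerts.
zero_counts = {
    "Seat comfort": 4781,
    "Departure_Arrival time convenient": 6644,
    "Food and drink": 5922,
    "Inflight entertainment": 2968,
    "Departure Delay in Minutes": 73209,
    "Arrival Delay in Minutes": 72753,
}

# Share of rows that are zero, as a percentage with one decimal.
zero_share = {name: round(100 * count / n_observations, 1)
              for name, count in zero_counts.items()}

for name, share in zero_share.items():
    print(f"{name}: {share}% zeros")
```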

Reproduction

Analysis started: 2021-12-07 09:30:11.381969
Analysis finished: 2021-12-07 09:31:35.056992
Duration: 1 minute and 23.68 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json
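
The reported duration follows from the two timestamps. As a quick check, a sketch using only the standard library:

```python
from datetime import datetime

started = datetime.fromisoformat("2021-12-07 09:30:11.381969")
finished = datetime.fromisoformat("2021-12-07 09:31:35.056992")

# Elapsed wall-clock time in seconds, then split into minutes and seconds.
elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)

print(f"{int(minutes)} minute(s) and {seconds:.2f} seconds")
```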

Variables

df_index
Real number (ℝ≥0)

HIGH CORRELATION
UNIFORM
UNIQUE

Distinct129487
Distinct (%)100.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean64949.17704
Minimum0
Maximum129879
Zeros1
Zeros (%)< 0.1%
Negative0
Negative (%)0.0%
Memory size1011.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 6506.3
Q1: 32487.5
Median: 64952
Q3: 97413.5
95-th percentile: 123385.7
Maximum: 129879
Range: 129879
Interquartile range (IQR): 64926

Descriptive statistics

Standard deviation: 37486.7854
Coefficient of variation (CV): 0.5771710607
Kurtosis: -1.199720117
Mean: 64949.17704
Median Absolute Deviation (MAD): 32463
Skewness: -0.0001640227638
Sum: 8410074087
Variance: 1405259080
Monotonicity: Strictly increasing
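
Several of the df_index statistics are derived from others in the same summary: the range is maximum minus minimum, the IQR is Q3 minus Q1, the CV is the standard deviation divided by the mean, and the variance is the squared standard deviation. The sketch below recomputes them from the rounded values printed in the report, so small rounding differences are expected.

```python
# Values copied from the df_index summary in this report.
minimum, maximum = 0, 129879
q1, q3 = 32487.5, 97413.5
mean, std = 64949.17704, 37486.7854

value_range = maximum - minimum   # reported Range: 129879
iqr = q3 - q1                     # reported IQR: 64926
cv = std / mean                   # reported CV: 0.5771710607
variance = std ** 2               # reported Variance: 1405259080

print(value_range, iqr, cv, variance)
```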
Histogram with fixed size bins (bins=50)
Value                  Count   Frequency (%)
0                      1       < 0.1%
86586                  1       < 0.1%
86599                  1       < 0.1%
86598                  1       < 0.1%
86597                  1       < 0.1%
86596                  1       < 0.1%
86595                  1       < 0.1%
86594                  1       < 0.1%
86593                  1       < 0.1%
86592                  1       < 0.1%
Other values (129477)  129477  > 99.9%

Value  Count  Frequency (%)
0      1      < 0.1%
1      1      < 0.1%
2      1      < 0.1%
3      1      < 0.1%
4      1      < 0.1%
5      1      < 0.1%
6      1      < 0.1%
7      1      < 0.1%
8      1      < 0.1%
9      1      < 0.1%

Value   Count  Frequency (%)
129879  1      < 0.1%
129878  1      < 0.1%
129877  1      < 0.1%
129876  1      < 0.1%
129875  1      < 0.1%
129874  1      < 0.1%
129872  1      < 0.1%
129871  1      < 0.1%
129870  1      < 0.1%
129869  1      < 0.1%

satisfaction
Categorical

HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1011.7 KiB
satisfied
70882 
dissatisfied
58605 

Length

Max length12
Median length9
Mean length10.35778109
Min length9
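
The length statistics describe the character length of each category label, weighted by how often the label occurs. A minimal sketch that recomputes them from the two reported counts (the variable names are illustrative):

```python
from statistics import median

# Category counts as reported for `satisfaction`.
counts = {"satisfied": 70882, "dissatisfied": 58605}

total = sum(counts.values())
mean_length = sum(len(label) * n for label, n in counts.items()) / total

# Expand to one length per row to take the median (cheap at this size).
lengths = [len(label) for label, n in counts.items() for _ in range(n)]
median_length = median(lengths)

print(min(lengths), median_length, round(mean_length, 8), max(lengths))
```

Because "satisfied" (9 characters) covers more than half of the rows, the median length equals the minimum length.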

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowsatisfied
2nd rowsatisfied
3rd rowsatisfied
4th rowsatisfied
5th rowsatisfied

Common Values

Value         Count  Frequency (%)
satisfied     70882  54.7%
dissatisfied  58605  45.3%

Length

Histogram of lengths of the category

Pie chart

Value         Count  Frequency (%)
satisfied     70882  54.7%
dissatisfied  58605  45.3%

Gender
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1011.7 KiB
Female
65703 
Male
63784 

Length

Max length6
Median length6
Mean length5.014820021
Min length4

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowFemale
2nd rowMale
3rd rowFemale
4th rowFemale
5th rowFemale

Common Values

Value   Count  Frequency (%)
Female  65703  50.7%
Male    63784  49.3%

Length

Histogram of lengths of the category

Pie chart

Value   Count  Frequency (%)
female  65703  50.7%
male    63784  49.3%

Customer Type
Categorical

HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1011.7 KiB
Loyal Customer
105773 
disloyal Customer
23714 

Length

Max length17
Median length14
Mean length14.54941423
Min length14

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowLoyal Customer
2nd rowLoyal Customer
3rd rowLoyal Customer
4th rowLoyal Customer
5th rowLoyal Customer

Common Values

Value              Count   Frequency (%)
Loyal Customer     105773  81.7%
disloyal Customer  23714   18.3%

Length

Histogram of lengths of the category

Pie chart

Value     Count   Frequency (%)
customer  129487  50.0%
loyal     105773  40.8%
disloyal  23714   9.2%
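
The counts in this table are consistent with splitting each lowercased category label into words and counting each word once per row. That reading is inferred from the numbers, not stated in the report. A sketch under that assumption:

```python
from collections import Counter

# Category counts as reported for `Customer Type`.
counts = {"Loyal Customer": 105773, "disloyal Customer": 23714}

# Assumed tokenisation: lowercase, split on whitespace.
words = Counter()
for label, n in counts.items():
    for word in label.lower().split():
        words[word] += n

total_words = sum(words.values())
shares = {w: round(100 * c / total_words, 1) for w, c in words.items()}

for word, count in words.most_common():
    print(word, count, f"{shares[word]}%")
```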

Age
Real number (ℝ≥0)

Distinct75
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean39.42876119
Minimum7
Maximum85
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size1011.7 KiB

Quantile statistics

Minimum: 7
5-th percentile: 15
Q1: 27
Median: 40
Q3: 51
95-th percentile: 64
Maximum: 85
Range: 78
Interquartile range (IQR): 24

Descriptive statistics

Standard deviation: 15.11759674
Coefficient of variation (CV): 0.3834154634
Kurtosis: -0.718736869
Mean: 39.42876119
Median Absolute Deviation (MAD): 12
Skewness: -0.003376447203
Sum: 5105512
Variance: 228.5417312
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value              Count  Frequency (%)
39                 3681   2.8%
25                 3501   2.7%
40                 3203   2.5%
44                 3099   2.4%
41                 3081   2.4%
42                 3011   2.3%
43                 2936   2.3%
23                 2932   2.3%
45                 2927   2.3%
22                 2926   2.3%
Other values (65)  98190  75.8%

Value  Count  Frequency (%)
7      682    0.5%
8      793    0.6%
9      852    0.7%
10     820    0.6%
11     831    0.6%
12     793    0.6%
13     805    0.6%
14     857    0.7%
15     1003   0.8%
16     1153   0.9%

Value  Count  Frequency (%)
85     25     < 0.1%
80     110    0.1%
79     52     < 0.1%
78     44     < 0.1%
77     106    0.1%
76     60     < 0.1%
75     76     0.1%
74     61     < 0.1%
73     67     0.1%
72     248    0.2%

Type of Travel
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1011.7 KiB
Business travel
89445 
Personal Travel
40042 

Length

Max length15
Median length15
Mean length15
Min length15

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowPersonal Travel
2nd rowPersonal Travel
3rd rowPersonal Travel
4th rowPersonal Travel
5th rowPersonal Travel

Common Values

Value            Count  Frequency (%)
Business travel  89445  69.1%
Personal Travel  40042  30.9%

Length

Histogram of lengths of the category

Pie chart

Value     Count   Frequency (%)
travel    129487  50.0%
business  89445   34.5%
personal  40042   15.5%

Class
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1011.7 KiB
Business
61990 
Eco
58117 
Eco Plus
9380 

Length

Max length8
Median length8
Mean length5.755875107
Min length3

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowEco
2nd rowBusiness
3rd rowEco
4th rowEco
5th rowEco

Common Values

Value     Count  Frequency (%)
Business  61990  47.9%
Eco       58117  44.9%
Eco Plus  9380   7.2%

Length

Histogram of lengths of the category

Pie chart

Value     Count  Frequency (%)
eco       67497  48.6%
business  61990  44.6%
plus      9380   6.8%

Flight Distance
Real number (ℝ≥0)

Distinct5397
Distinct (%)4.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean1981.008974
Minimum50
Maximum6951
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size1011.7 KiB

Quantile statistics

Minimum: 50
5-th percentile: 341
Q1: 1359
Median: 1924
Q3: 2543
95-th percentile: 3830
Maximum: 6951
Range: 6901
Interquartile range (IQR): 1184

Descriptive statistics

Standard deviation: 1026.884131
Coefficient of variation (CV): 0.5183641996
Kurtosis: 0.3649469556
Mean: 1981.008974
Median Absolute Deviation (MAD): 594
Skewness: 0.466458621
Sum: 256514909
Variance: 1054491.019
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count   Frequency (%)
1963                 92      0.1%
1812                 87      0.1%
1639                 87      0.1%
1789                 86      0.1%
1981                 85      0.1%
1759                 83      0.1%
1766                 83      0.1%
1748                 82      0.1%
1769                 81      0.1%
2022                 81      0.1%
Other values (5387)  128640  99.3%

Value  Count  Frequency (%)
50     23     < 0.1%
51     21     < 0.1%
52     20     < 0.1%
53     28     < 0.1%
54     21     < 0.1%
55     22     < 0.1%
56     30     < 0.1%
57     21     < 0.1%
58     15     < 0.1%
59     24     < 0.1%

Value  Count  Frequency (%)
6951   1      < 0.1%
6950   1      < 0.1%
6948   1      < 0.1%
6924   1      < 0.1%
6907   2      < 0.1%
6889   1      < 0.1%
6882   1      < 0.1%
6868   1      < 0.1%
6865   1      < 0.1%
6837   1      < 0.1%

Seat comfort
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct6
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean2.838586113
Minimum0
Maximum5
Zeros4781
Zeros (%)3.7%
Negative0
Negative (%)0.0%
Memory size1011.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 2
Median: 3
Q3: 4
95-th percentile: 5
Maximum: 5
Range: 5
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.392873338
Coefficient of variation (CV): 0.4906926487
Kurtosis: -0.9430386431
Mean: 2.838586113
Median Absolute Deviation (MAD): 1
Skewness: -0.0918405445
Sum: 367560
Variance: 1.940096137
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Value  Count  Frequency (%)
3      29096  22.5%
2      28645  22.1%
4      28315  21.9%
1      20882  16.1%
5      17768  13.7%
0      4781   3.7%

Value  Count  Frequency (%)
0      4781   3.7%
1      20882  16.1%
2      28645  22.1%
3      29096  22.5%
4      28315  21.9%
5      17768  13.7%

Value  Count  Frequency (%)
5      17768  13.7%
4      28315  21.9%
3      29096  22.5%
2      28645  22.1%
1      20882  16.1%
0      4781   3.7%

Departure_Arrival time convenient
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct6
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean2.990277016
Minimum0
Maximum5
Zeros6644
Zeros (%)5.1%
Negative0
Negative (%)0.0%
Memory size1011.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2
Median: 3
Q3: 4
95-th percentile: 5
Maximum: 5
Range: 5
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.527183318
Coefficient of variation (CV): 0.5107163349
Kurtosis: -1.089538944
Mean: 2.990277016
Median Absolute Deviation (MAD): 1
Skewness: -0.2519380042
Sum: 387202
Variance: 2.332288887
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Value  Count  Frequency (%)
4      29504  22.8%
5      26723  20.6%
3      23110  17.8%
2      22735  17.6%
1      20771  16.0%
0      6644   5.1%

Value  Count  Frequency (%)
0      6644   5.1%
1      20771  16.0%
2      22735  17.6%
3      23110  17.8%
4      29504  22.8%
5      26723  20.6%

Value  Count  Frequency (%)
5      26723  20.6%
4      29504  22.8%
3      23110  17.8%
2      22735  17.6%
1      20771  16.0%
0      6644   5.1%

Food and drink
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct6
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean2.852023755
Minimum0
Maximum5
Zeros5922
Zeros (%)4.6%
Negative0
Negative (%)0.0%
Memory size1011.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 2
Median: 3
Q3: 4
95-th percentile: 5
Maximum: 5
Range: 5
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.443587344
Coefficient of variation (CV): 0.5061624544
Kurtosis: -0.9866208473
Mean: 2.852023755
Median Absolute Deviation (MAD): 1
Skewness: -0.1165931268
Sum: 369300
Variance: 2.08394442
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Value  Count  Frequency (%)
3      28065  21.7%
4      27129  21.0%
2      27078  20.9%
1      21008  16.2%
5      20285  15.7%
0      5922   4.6%

Value  Count  Frequency (%)
0      5922   4.6%
1      21008  16.2%
2      27078  20.9%
3      28065  21.7%
4      27129  21.0%
5      20285  15.7%

Value  Count  Frequency (%)
5      20285  15.7%
4      27129  21.0%
3      28065  21.7%
2      27078  20.9%
1      21008  16.2%
0      5922   4.6%

Gate location
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct6
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean2.990377412
Minimum0
Maximum5
Zeros2
Zeros (%)< 0.1%
Negative0
Negative (%)0.0%
Memory size1011.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 2
Median: 3
Q3: 4
95-th percentile: 5
Maximum: 5
Range: 5
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.305917291
Coefficient of variation (CV): 0.4367065125
Kurtosis: -1.089690106
Mean: 2.990377412
Median Absolute Deviation (MAD): 1
Skewness: -0.05307977321
Sum: 387215
Variance: 1.70541997
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Value  Count  Frequency (%)
3      33451  25.8%
4      29997  23.2%
2      24441  18.9%
1      22497  17.4%
5      19099  14.7%
0      2      < 0.1%

Value  Count  Frequency (%)
0      2      < 0.1%
1      22497  17.4%
2      24441  18.9%
3      33451  25.8%
4      29997  23.2%
5      19099  14.7%

Value  Count  Frequency (%)
5      19099  14.7%
4      29997  23.2%
3      33451  25.8%
2      24441  18.9%
1      22497  17.4%
0      2      < 0.1%

Inflight wifi service
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct6
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.249160147
Minimum0
Maximum5
Zeros130
Zeros (%)0.1%
Negative0
Negative (%)0.0%
Memory size1011.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 2
Median: 3
Q3: 4
95-th percentile: 5
Maximum: 5
Range: 5
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.318764761
Coefficient of variation (CV): 0.4058786583
Kurtosis: -1.121498672
Mean: 3.249160147
Median Absolute Deviation (MAD): 1
Skewness: -0.1911962779
Sum: 420724
Variance: 1.739140495
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Value  Count  Frequency (%)
4      31474  24.3%
5      28738  22.2%
3      27518  21.3%
2      26957  20.8%
1      14670  11.3%
0      130    0.1%

Value  Count  Frequency (%)
0      130    0.1%
1      14670  11.3%
2      26957  20.8%
3      27518  21.3%
4      31474  24.3%
5      28738  22.2%

Value  Count  Frequency (%)
5      28738  22.2%
4      31474  24.3%
3      27518  21.3%
2      26957  20.8%
1      14670  11.3%
0      130    0.1%

Inflight entertainment
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct6
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.383745086
Minimum0
Maximum5
Zeros2968
Zeros (%)2.3%
Negative0
Negative (%)0.0%
Memory size1011.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 2
Median: 4
Q3: 4
95-th percentile: 5
Maximum: 5
Range: 5
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.345959402
Coefficient of variation (CV): 0.3977721039
Kurtosis: -0.5322206039
Mean: 3.383745086
Median Absolute Deviation (MAD): 1
Skewness: -0.6050587183
Sum: 438151
Variance: 1.811606712
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Value  Count  Frequency (%)
4      41752  32.2%
5      29748  23.0%
3      24133  18.6%
2      19118  14.8%
1      11768  9.1%
0      2968   2.3%

Value  Count  Frequency (%)
0      2968   2.3%
1      11768  9.1%
2      19118  14.8%
3      24133  18.6%
4      41752  32.2%
5      29748  23.0%

Value  Count  Frequency (%)
5      29748  23.0%
4      41752  32.2%
3      24133  18.6%
2      19118  14.8%
1      11768  9.1%
0      2968   2.3%

Online support
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct6
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.519967255
Minimum0
Maximum5
Zeros1
Zeros (%)< 0.1%
Negative0
Negative (%)0.0%
Memory size1011.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 3
Median: 4
Q3: 5
95-th percentile: 5
Maximum: 5
Range: 5
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.306325613
Coefficient of variation (CV): 0.3711186831
Kurtosis: -0.8095990054
Mean: 3.519967255
Median Absolute Deviation (MAD): 1
Skewness: -0.5758473896
Sum: 455790
Variance: 1.706486606
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Value  Count  Frequency (%)
4      41406  32.0%
5      35451  27.4%
3      21543  16.6%
2      17196  13.3%
1      13890  10.7%
0      1      < 0.1%

Value  Count  Frequency (%)
0      1      < 0.1%
1      13890  10.7%
2      17196  13.3%
3      21543  16.6%
4      41406  32.0%
5      35451  27.4%

Value  Count  Frequency (%)
5      35451  27.4%
4      41406  32.0%
3      21543  16.6%
2      17196  13.3%
1      13890  10.7%
0      1      < 0.1%

Ease of Online booking
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct6
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.472170952
Minimum0
Maximum5
Zeros18
Zeros (%)< 0.1%
Negative0
Negative (%)0.0%
Memory size1011.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 2
Median: 4
Q3: 5
95-th percentile: 5
Maximum: 5
Range: 5
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 1.305572646
Coefficient of variation (CV): 0.3760104741
Kurtosis: -0.9104928388
Mean: 3.472170952
Median Absolute Deviation (MAD): 1
Skewness: -0.4919011447
Sum: 449601
Variance: 1.704519933
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Value  Count  Frequency (%)
4      39807  30.7%
5      34034  26.3%
3      22344  17.3%
2      19887  15.4%
1      13397  10.3%
0      18     < 0.1%

Value  Count  Frequency (%)
0      18     < 0.1%
1      13397  10.3%
2      19887  15.4%
3      22344  17.3%
4      39807  30.7%
5      34034  26.3%

Value  Count  Frequency (%)
5      34034  26.3%
4      39807  30.7%
3      22344  17.3%
2      19887  15.4%
1      13397  10.3%
0      18     < 0.1%

Onboard service
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct6
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.465143219
Minimum0
Maximum5
Zeros5
Zeros (%)< 0.1%
Negative0
Negative (%)0.0%
Memory size1011.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 3
Median: 4
Q3: 4
95-th percentile: 5
Maximum: 5
Range: 5
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.270755199
Coefficient of variation (CV): 0.3667251592
Kurtosis: -0.7846678952
Mean: 3.465143219
Median Absolute Deviation (MAD): 1
Skewness: -0.5054024865
Sum: 448691
Variance: 1.614818775
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Value  Count  Frequency (%)
4      40558  31.3%
5      31625  24.4%
3      26959  20.8%
2      17117  13.2%
1      13223  10.2%
0      5      < 0.1%

Value  Count  Frequency (%)
0      5      < 0.1%
1      13223  10.2%
2      17117  13.2%
3      26959  20.8%
4      40558  31.3%
5      31625  24.4%

Value  Count  Frequency (%)
5      31625  24.4%
4      40558  31.3%
3      26959  20.8%
2      17117  13.2%
1      13223  10.2%
0      5      < 0.1%

Leg room service
Real number (ℝ≥0)

HIGH CORRELATION

Distinct6
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.486118298
Minimum0
Maximum5
Zeros442
Zeros (%)0.3%
Negative0
Negative (%)0.0%
Memory size1011.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 2
Median: 4
Q3: 5
95-th percentile: 5
Maximum: 5
Range: 5
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 1.292079137
Coefficient of variation (CV): 0.3706354824
Kurtosis: -0.8411628029
Mean: 3.486118298
Median Absolute Deviation (MAD): 1
Skewness: -0.4965037365
Sum: 451407
Variance: 1.669468496
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Value  Count  Frequency (%)
4      39583  30.6%
5      34284  26.5%
3      22397  17.3%
2      21683  16.7%
1      11098  8.6%
0      442    0.3%

Value  Count  Frequency (%)
0      442    0.3%
1      11098  8.6%
2      21683  16.7%
3      22397  17.3%
4      39583  30.6%
5      34284  26.5%

Value  Count  Frequency (%)
5      34284  26.5%
4      39583  30.6%
3      22397  17.3%
2      21683  16.7%
1      11098  8.6%
0      442    0.3%

Baggage handling
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1011.7 KiB
4
48107 
5
35623 
3
24413 
2
13388 
1
7956 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row3
2nd row4
3rd row4
4th row1
5th row2

Common Values

Value  Count  Frequency (%)
4      48107  37.2%
5      35623  27.5%
3      24413  18.9%
2      13388  10.3%
1      7956   6.1%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
4      48107  37.2%
5      35623  27.5%
3      24413  18.9%
2      13388  10.3%
1      7956   6.1%

Checkin service
Real number (ℝ≥0)

HIGH CORRELATION

Distinct6
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.340729185
Minimum0
Maximum5
Zeros1
Zeros (%)< 0.1%
Negative0
Negative (%)0.0%
Memory size1011.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 3
Median: 3
Q3: 4
95-th percentile: 5
Maximum: 5
Range: 5
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.260560868
Coefficient of variation (CV): 0.3773310552
Kurtosis: -0.7935625042
Mean: 3.340729185
Median Absolute Deviation (MAD): 1
Skewness: -0.3923632404
Sum: 432581
Variance: 1.589013703
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Value  Count  Frequency (%)
4      36372  28.1%
3      35430  27.4%
5      26919  20.8%
2      15443  11.9%
1      15322  11.8%
0      1      < 0.1%

Value  Count  Frequency (%)
0      1      < 0.1%
1      15322  11.8%
2      15443  11.9%
3      35430  27.4%
4      36372  28.1%
5      26919  20.8%

Value  Count  Frequency (%)
5      26919  20.8%
4      36372  28.1%
3      35430  27.4%
2      15443  11.9%
1      15322  11.8%
0      1      < 0.1%

Cleanliness
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct6
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.705885533
Minimum0
Maximum5
Zeros5
Zeros (%)< 0.1%
Negative0
Negative (%)0.0%
Memory size1011.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 3
Median: 4
Q3: 5
95-th percentile: 5
Maximum: 5
Range: 5
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.151683387
Coefficient of variation (CV): 0.310771441
Kurtosis: -0.2077567897
Mean: 3.705885533
Median Absolute Deviation (MAD): 1
Skewness: -0.7564386769
Sum: 479864
Variance: 1.326374625
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Value  Count  Frequency (%)
4      48665  37.6%
5      35803  27.6%
3      23907  18.5%
2      13361  10.3%
1      7746   6.0%
0      5      < 0.1%

Value  Count  Frequency (%)
0      5      < 0.1%
1      7746   6.0%
2      13361  10.3%
3      23907  18.5%
4      48665  37.6%
5      35803  27.6%

Value  Count  Frequency (%)
5      35803  27.6%
4      48665  37.6%
3      23907  18.5%
2      13361  10.3%
1      7746   6.0%
0      5      < 0.1%

Online boarding
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct6
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.352545043
Minimum0
Maximum5
Zeros14
Zeros (%)< 0.1%
Negative0
Negative (%)0.0%
Memory size1011.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 2
Median: 4
Q3: 4
95-th percentile: 5
Maximum: 5
Range: 5
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.298623949
Coefficient of variation (CV): 0.387354661
Kurtosis: -0.9378595022
Mean: 3.352545043
Median Absolute Deviation (MAD): 1
Skewness: -0.3664784908
Sum: 434111
Variance: 1.68642416
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Value  Count  Frequency (%)
4      35079  27.1%
3      30692  23.7%
5      29875  23.1%
2      18517  14.3%
1      15310  11.8%
0      14     < 0.1%

Value  Count  Frequency (%)
0      14     < 0.1%
1      15310  11.8%
2      18517  14.3%
3      30692  23.7%
4      35079  27.1%
5      29875  23.1%

Value  Count  Frequency (%)
5      29875  23.1%
4      35079  27.1%
3      30692  23.7%
2      18517  14.3%
1      15310  11.8%
0      14     < 0.1%

Departure Delay in Minutes
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct464
Distinct (%)0.4%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean14.64338505
Minimum0
Maximum1592
Zeros73209
Zeros (%)56.5%
Negative0
Negative (%)0.0%
Memory size1011.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 12
95-th percentile: 77
Maximum: 1592
Range: 1592
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 37.93286655
Coefficient of variation (CV): 2.590443837
Kurtosis: 101.8829471
Mean: 14.64338505
Median Absolute Deviation (MAD): 0
Skewness: 6.853577956
Sum: 1896128
Variance: 1438.902365
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
0                   73209  56.5%
1                   3671   2.8%
2                   2845   2.2%
3                   2530   2.0%
4                   2298   1.8%
5                   2131   1.6%
6                   1881   1.5%
7                   1745   1.3%
8                   1613   1.2%
9                   1550   1.2%
Other values (454)  36014  27.8%

Value  Count  Frequency (%)
0      73209  56.5%
1      3671   2.8%
2      2845   2.2%
3      2530   2.0%
4      2298   1.8%
5      2131   1.6%
6      1881   1.5%
7      1745   1.3%
8      1613   1.2%
9      1550   1.2%

Value  Count  Frequency (%)
1592   1      < 0.1%
1305   1      < 0.1%
1128   1      < 0.1%
1017   1      < 0.1%
978    1      < 0.1%
951    1      < 0.1%
933    1      < 0.1%
930    1      < 0.1%
921    1      < 0.1%
859    1      < 0.1%

Arrival Delay in Minutes
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct472
Distinct (%)0.4%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean15.09112884
Minimum0
Maximum1584
Zeros72753
Zeros (%)56.2%
Negative0
Negative (%)0.0%
Memory size1011.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 13
95-th percentile: 78
Maximum: 1584
Range: 1584
Interquartile range (IQR): 13

Descriptive statistics

Standard deviation: 38.46565024
Coefficient of variation (CV): 2.548891514
Kurtosis: 95.11711419
Mean: 15.09112884
Median Absolute Deviation (MAD): 0
Skewness: 6.670124611
Sum: 1954105
Variance: 1479.606248
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 72753 | 56.2%
1 | 2747 | 2.1%
2 | 2587 | 2.0%
3 | 2442 | 1.9%
4 | 2373 | 1.8%
5 | 2083 | 1.6%
6 | 2021 | 1.6%
7 | 1794 | 1.4%
8 | 1751 | 1.4%
9 | 1566 | 1.2%
Other values (462) | 37370 | 28.9%
Value | Count | Frequency (%)
0 | 72753 | 56.2%
1 | 2747 | 2.1%
2 | 2587 | 2.0%
3 | 2442 | 1.9%
4 | 2373 | 1.8%
5 | 2083 | 1.6%
6 | 2021 | 1.6%
7 | 1794 | 1.4%
8 | 1751 | 1.4%
9 | 1566 | 1.2%
Value | Count | Frequency (%)
1584 | 1 | < 0.1%
1280 | 1 | < 0.1%
1115 | 1 | < 0.1%
1011 | 1 | < 0.1%
970 | 1 | < 0.1%
952 | 1 | < 0.1%
940 | 1 | < 0.1%
924 | 1 | < 0.1%
920 | 1 | < 0.1%
860 | 1 | < 0.1%

Interactions

Correlations

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic relationships than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
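
As a minimal illustration of that recipe, the sketch below ranks both variables (tied values share the average of their positions, a common convention assumed here) and applies the covariance-over-standard-deviations formula to the ranks. The function names and sample data are made up for illustration and are not taken from the report's implementation.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def spearman_rho(x, y):
    """Pearson's r computed on the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # monotonic but nonlinear in x

rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(f"spearman={rho:.4f} pearson={r:.4f}")
```

Because y increases whenever x does, the ranks agree exactly and ρ reaches its maximum, while r on the raw values stays below 1.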

Pearson's r

Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a linear relationship the steepness of the line does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
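
The following sketch applies that definition directly and then checks the invariance property by shifting and rescaling both variables. The data and the name `pearson_r` are illustrative assumptions, not values or code from the report.

```python
from math import sqrt

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator.
    return cov / sqrt(var_x * var_y)

x = [0, 5, 12, 30, 77]  # made-up values
y = [1, 4, 15, 28, 80]

r_original = pearson_r(x, y)

# Change location and scale of each variable separately (positive scale factors).
x_transformed = [2.0 * a + 10.0 for a in x]
y_transformed = [0.5 * b - 3.0 for b in y]
r_transformed = pearson_r(x_transformed, y_transformed)

print(f"original={r_original:.6f} transformed={r_transformed:.6f}")
```

Both calls should print the same value up to floating-point rounding.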

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
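
The pair-counting definition can be sketched as follows. This is the plain variant described in the text (often called τ-a), in which tied pairs count as neither concordant nor discordant; statistical libraries frequently default to a tie-adjusted variant instead, so results may differ on data with many ties. The function name and sample data are illustrative.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # The sign of the product tells whether x and y move in the same direction.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # direction == 0 means a tie in x or y: counted in neither group.
    return (concordant - discordant) / len(pairs)

x = [1, 2, 3, 4]
y = [1, 3, 2, 4]  # one swapped pair relative to x

tau = kendall_tau_a(x, y)
print(f"tau={tau:.4f}")
```

With four observations there are six pairs; five are concordant and one is discordant, which gives τ = 4/6.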

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
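
As a rough sketch of what such a bias correction looks like, the code below computes the chi-squared statistic from a contingency table of two categorical sequences and then applies the correction terms usually attributed to Bergsma (2013). It is an illustration under that assumption, not the report's own implementation; the function name and sample data are hypothetical.

```python
from collections import Counter
from math import sqrt

def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two equally long categorical sequences."""
    n = len(a)
    joint = Counter(zip(a, b))
    row_totals = Counter(a)
    col_totals = Counter(b)

    # Pearson chi-squared statistic over every cell of the contingency table.
    chi2 = 0.0
    for row_value, row_count in row_totals.items():
        for col_value, col_count in col_totals.items():
            expected = row_count * col_count / n
            observed = joint.get((row_value, col_value), 0)
            chi2 += (observed - expected) ** 2 / expected

    r, k = len(row_totals), len(col_totals)
    phi2 = chi2 / n
    # Correction terms: shrink phi^2 and the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    return sqrt(phi2_corr / denom) if denom > 0 else 0.0

# Perfect association: b is a relabelling of a.
a = ["x", "y"] * 50
b = ["p", "q"] * 50
v_perfect = cramers_v_corrected(a, b)

# Independence: every combination occurs equally often.
c = ["x", "x", "y", "y"] * 25
d = ["p", "q", "p", "q"] * 25
v_independent = cramers_v_corrected(c, d)

print(f"perfect={v_perfect:.4f} independent={v_independent:.4f}")
```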

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

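(index) | df_index | satisfaction | Gender | Customer Type | Age | Type of Travel | Class | Flight Distance | Seat comfort | Departure_Arrival time convenient | Food and drink | Gate location | Inflight wifi service | Inflight entertainment | Online support | Ease of Online booking | Onboard service | Leg room service | Baggage handling | Checkin service | Cleanliness | Online boarding | Departure Delay in Minutes | Arrival Delay in Minutes
0 | 0 | satisfied | Female | Loyal Customer | 65 | Personal Travel | Eco | 265 | 0 | 0 | 0 | 2 | 2 | 4 | 2 | 3 | 3 | 0 | 3 | 5 | 3 | 2 | 0 | 0.0
1 | 1 | satisfied | Male | Loyal Customer | 47 | Personal Travel | Business | 2464 | 0 | 0 | 0 | 3 | 0 | 2 | 2 | 3 | 4 | 4 | 4 | 2 | 3 | 2 | 310 | 305.0
2 | 2 | satisfied | Female | Loyal Customer | 15 | Personal Travel | Eco | 2138 | 0 | 0 | 0 | 3 | 2 | 0 | 2 | 2 | 3 | 3 | 4 | 4 | 4 | 2 | 0 | 0.0
3 | 3 | satisfied | Female | Loyal Customer | 60 | Personal Travel | Eco | 623 | 0 | 0 | 0 | 3 | 3 | 4 | 3 | 1 | 1 | 0 | 1 | 4 | 1 | 3 | 0 | 0.0
4 | 4 | satisfied | Female | Loyal Customer | 70 | Personal Travel | Eco | 354 | 0 | 0 | 0 | 3 | 4 | 3 | 4 | 2 | 2 | 0 | 2 | 4 | 2 | 5 | 0 | 0.0
5 | 5 | satisfied | Male | Loyal Customer | 30 | Personal Travel | Eco | 1894 | 0 | 0 | 0 | 3 | 2 | 0 | 2 | 2 | 5 | 4 | 5 | 5 | 4 | 2 | 0 | 0.0
6 | 6 | satisfied | Female | Loyal Customer | 66 | Personal Travel | Eco | 227 | 0 | 0 | 0 | 3 | 2 | 5 | 5 | 5 | 5 | 0 | 5 | 5 | 5 | 3 | 17 | 15.0
7 | 7 | satisfied | Male | Loyal Customer | 10 | Personal Travel | Eco | 1812 | 0 | 0 | 0 | 3 | 2 | 0 | 2 | 2 | 3 | 3 | 4 | 5 | 4 | 2 | 0 | 0.0
8 | 8 | satisfied | Female | Loyal Customer | 56 | Personal Travel | Business | 73 | 0 | 0 | 0 | 3 | 5 | 3 | 5 | 4 | 4 | 0 | 1 | 5 | 4 | 4 | 0 | 0.0
9 | 9 | satisfied | Male | Loyal Customer | 22 | Personal Travel | Eco | 1556 | 0 | 0 | 0 | 3 | 2 | 0 | 2 | 2 | 2 | 4 | 5 | 3 | 4 | 2 | 30 | 26.0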

Last rows

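(index) | df_index | satisfaction | Gender | Customer Type | Age | Type of Travel | Class | Flight Distance | Seat comfort | Departure_Arrival time convenient | Food and drink | Gate location | Inflight wifi service | Inflight entertainment | Online support | Ease of Online booking | Onboard service | Leg room service | Baggage handling | Checkin service | Cleanliness | Online boarding | Departure Delay in Minutes | Arrival Delay in Minutes
129477 | 129869 | satisfied | Female | disloyal Customer | 55 | Personal Travel | Eco | 1953 | 5 | 2 | 5 | 4 | 1 | 5 | 5 | 1 | 1 | 1 | 3 | 3 | 4 | 1 | 0 | 0.0
129478 | 129870 | satisfied | Female | disloyal Customer | 70 | Personal Travel | Eco | 1674 | 5 | 4 | 5 | 1 | 5 | 5 | 5 | 5 | 3 | 2 | 4 | 5 | 4 | 5 | 54 | 46.0
129479 | 129871 | satisfied | Female | disloyal Customer | 35 | Personal Travel | Eco | 3287 | 5 | 4 | 5 | 3 | 2 | 5 | 2 | 2 | 4 | 5 | 4 | 4 | 3 | 2 | 9 | 0.0
129480 | 129872 | satisfied | Female | disloyal Customer | 69 | Personal Travel | Eco | 2240 | 5 | 4 | 5 | 3 | 4 | 5 | 4 | 4 | 5 | 4 | 4 | 3 | 4 | 4 | 4 | 0.0
129481 | 129874 | satisfied | Female | disloyal Customer | 11 | Personal Travel | Eco | 2752 | 5 | 5 | 5 | 2 | 2 | 5 | 2 | 2 | 3 | 5 | 3 | 5 | 4 | 2 | 5 | 0.0
129482 | 129875 | satisfied | Female | disloyal Customer | 29 | Personal Travel | Eco | 1731 | 5 | 5 | 5 | 3 | 2 | 5 | 2 | 2 | 3 | 3 | 4 | 4 | 4 | 2 | 0 | 0.0
129483 | 129876 | dissatisfied | Male | disloyal Customer | 63 | Personal Travel | Business | 2087 | 2 | 3 | 2 | 4 | 2 | 1 | 1 | 3 | 2 | 3 | 3 | 1 | 2 | 1 | 174 | 172.0
129484 | 129877 | dissatisfied | Male | disloyal Customer | 69 | Personal Travel | Eco | 2320 | 3 | 0 | 3 | 3 | 3 | 2 | 2 | 4 | 4 | 3 | 4 | 2 | 3 | 2 | 155 | 163.0
129485 | 129878 | dissatisfied | Male | disloyal Customer | 66 | Personal Travel | Eco | 2450 | 3 | 2 | 3 | 2 | 3 | 2 | 2 | 3 | 3 | 2 | 3 | 2 | 1 | 2 | 193 | 205.0
129486 | 129879 | dissatisfied | Female | disloyal Customer | 38 | Personal Travel | Eco | 4307 | 3 | 4 | 3 | 3 | 3 | 3 | 3 | 4 | 5 | 5 | 5 | 3 | 3 | 3 | 185 | 186.0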